Introduction

The  linear  mixed-effects  models  (MIXED)  procedure  in  SPSS  enables  you  to  fit  linear  mixed-effects  models  to  data  sampled 
from  normal  distributions.  Recent  texts,  such  as  those  by  McCulloch  and  Searle  (2000)  and  Verbeke  and  Molenberghs 
(2000),  comprehensively  review  mixed-effects  models.  The  MIXED  procedure  fits  models  more  general  than  those  of  the 
general  linear  model  (GLM)  procedure  and  it  encompasses  all  models  in  the  variance  components  (VARCOMP)  procedure. 
This  report  illustrates  the  types  of  models  that  MIXED  handles.  We  begin  with  an  explanation  of  simple  models  that  can  be 
fitted  using  GLM  and  VARCOMP,  to  show  how  they  are  translated  into  MIXED.  We  then  proceed  to  fit  models  that  are  unique 
to  MIXED. 


The  major  capabilities  that  differentiate  MIXED  from  GLM  are  that  MIXED  handles  correlated  data  and  unequal  variances. 
Correlated  data  are  very  common  in  such  situations  as  repeated  measurements  of  survey  respondents  or  experimental 
subjects.  MIXED  extends  repeated  measures  models  in  GLM  to  allow  an  unequal  number  of  repetitions.  It  also  handles  more 
complex  situations  in  which  experimental  units  are  nested  in  a  hierarchy.  MIXED  can,  for  example,  process  data  obtained 
from  a  sample  of  students  selected  from  a  sample  of  schools  in  a  district. 

In  a  linear  mixed-effects  model,  responses  from  a  subject  are  thought  to  be  the  sum  (linear)  of  so-called  fixed  and  random 
effects.  If  an  effect,  such  as  a  medical  treatment,  affects  the  population  mean,  it  is  fixed.  If  an  effect  is  associated  with  a 
sampling  procedure  (e.g.,  subject  effect),  it  is  random.  In  a  mixed-effects  model,  random  effects  contribute  only  to  the 
covariance  structure  of  the  data.  The  presence  of  random  effects,  however,  often  introduces  correlations  between  cases  as 
well.  Though  the  fixed  effect  is  the  primary  interest  in  most  studies  or  experiments,  it  is  necessary  to  adjust  for  the  covariance 
structure  of  the  data.  The  adjustment  made  in  procedures  like  GLM-Univariate  is  often  not  appropriate  because  it  assumes 
independence  of  the  data. 

The  MIXED  procedure  solves  these  problems  by  providing  the  tools  necessary  to  estimate  fixed  and  random  effects  in  one 
model.  MIXED  is  based,  furthermore,  on  maximum  likelihood  (ML)  and  restricted  maximum  likelihood  (REML)  methods,  versus 
the  analysis  of  variance  (ANOVA)  methods  in  GLM.  ANOVA  methods  produce  an  optimum  estimator  (minimum  variance)  for 
balanced  designs,  whereas  ML  and  REML  yield  asymptotically  efficient  estimators  for  balanced  and  unbalanced  designs.  ML 
and  REML  thus  present  a  clear  advantage  over  ANOVA  methods  in  modeling  real  data,  since  data  are  often  unbalanced.  The 
asymptotic  normality  of  ML  and  REML  estimators,  furthermore,  conveniently  allows  us  to  make  inferences  on  the  covariance 
parameters  of  the  model,  which  is  difficult  to  do  in  GLM. 


Data  preparation  for  MIXED 

Many datasets store repeated observations on a sample of subjects in “one subject per row” format. MIXED,
however, expects that observations from a subject are encoded in separate rows. To illustrate, we select a
subset of cases from the data that appear in Potthoff and Roy (1964). The data shown in Figure 1 encode, in
one row, three repeated measurements of a dependent variable (“dist1” to “dist3”) from a subject observed at
different ages (“age1” to “age3”).


Figure 1. MIXED, however, requires that measurements at different ages be collapsed into one variable, so
that each subject has three cases. The Data Restructure Wizard in SPSS simplifies the tedious data conversion
process. We choose “Data->Restructure” from the pull-down menu, and select the option “Restructure selected
variables into cases.” We then click the “Next” button to reach the dialog shown in Figure 2.
[Screenshot: Restructure Data Wizard, Step 2 of 7, “Variables to Cases: Number of Variable Groups”]

Figure 2. We need to convert two groups of variables (“age” and “dist”) into cases. We therefore enter “2”
and click “Next.” This brings us to the “Select Variables” dialog box.
Figure 3. In the “Select Variables” dialog box, we first specify “Subject ID [subid]” as the case group
identification. We then enter the names of new variables in the target variable drop-down list. For the
target variable “age,” we drag “age1,” “age2,” and “age3” to the list box in the “Variables to be Transposed”
group. We similarly associate variables “dist1,” “dist2,” and “dist3” with the target variable “distance.”
We then drag variables that do not vary within a subject to the “Fixed Variable(s)” box. Clicking “Next”
brings us to the “Create Index Variables” dialog box.

We accept the default of one index variable, then click “Next” to arrive at the final dialog box.
Figure 4. In the “Create One Index Variable” dialog box, we enter “visit” as the name of the indexing
variable and click “Finish.”


Figure  5.  We  now  have  three  cases  for 
each  subject. 
We can also perform the conversion using the following command syntax:

VARSTOCASES
/MAKE age FROM age1 age2 age3
/MAKE distance FROM dist1 dist2 dist3
/INDEX = visit(3)
/KEEP = subid gender.

The command syntax is easy to interpret: it collapses the three age variables into “age” and the three
response variables into “distance.” At the same time, a new variable, “visit,” is created to index the three
new cases within each subject. The last subcommand means that the two variables that are constant within a
subject (“subid” and “gender”) should be kept.
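For readers who want to see the restructuring logic outside SPSS, the following is a minimal Python sketch of the same wide-to-long conversion. The function name `vars_to_cases`, its arguments, and the single data row are hypothetical and chosen only for illustration; they are not part of SPSS or of the original dataset file.

```python
def vars_to_cases(rows, make, keep, index_name="visit"):
    """Turn one-row-per-subject records into one row per repeated measurement.

    rows: list of dicts in "wide" format (one dict per subject).
    make: maps each new variable name to the ordered list of wide variables.
    keep: names of variables that are constant within a subject.
    """
    long_rows = []
    for row in rows:
        # All variable groups are assumed to have the same number of levels.
        n_levels = len(next(iter(make.values())))
        for level in range(n_levels):
            case = {name: row[name] for name in keep}
            case[index_name] = level + 1  # index starts at 1, like visit(3)
            for target, sources in make.items():
                case[target] = row[sources[level]]
            long_rows.append(case)
    return long_rows


# Illustrative values only (not taken from the report).
wide = [
    {"subid": 1, "gender": "F",
     "age1": 8, "age2": 10, "age3": 12,
     "dist1": 21.0, "dist2": 20.0, "dist3": 21.5},
]
make = {"age": ["age1", "age2", "age3"],
        "distance": ["dist1", "dist2", "dist3"]}

long_rows = vars_to_cases(wide, make, keep=["subid", "gender"])
for case in long_rows:
    print(case)
```

Each subject contributes three cases, one per visit, which is the layout MIXED expects.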

Fitting  fixed-effects  models 
With  iid  residual  errors 

A fitted model has the form $y = X\beta + \varepsilon$, where $y$ is a vector of responses, $X$ is the
fixed-effects design matrix, $\beta$ is a vector of fixed-effects parameters and $\varepsilon$ is a vector of
residual errors. In this model, we assume that $\varepsilon$ is distributed as $N(0, R)$, where $R$ is an
unknown covariance matrix. A common belief is that $R = \sigma^2 I$. We can use GLM or MIXED to fit a model
with this assumption. Using a subset of the growth study dataset, we illustrate how to use MIXED to fit a
fixed-effects model. The following command (Example 1) fits a fixed-effects model that investigates the
effect of the variables “gender” and “age” on “distance,” which is a measure of the growth rate.

Example 1: Fixed-effects model using MIXED

Command  syntax: 

MIXED  DISTANCE  BY  GENDER  WITH  AGE 
/FIXED  =  GENDER  AGE  |  SSTYPE(3) 

/PRINT  =  SOLUTION  TESTCOV. 

Output: 


Type III Tests of Fixed Effects(a)

Source      Numerator df   Denominator df   F        Sig.
Intercept   1              27.000           38.356   .000
gender      1              27.000           7.621    .010
age         1              27.000           11.040   .003

a. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 6


Estimates of Fixed Effects(b)

                                                              95% Confidence Interval
Parameter    Estimate   Std. Error   df       t        Sig.   Lower Bound   Upper Bound
Intercept    17.050     2.620        27.000   6.507    .000   11.673        22.427
[gender=F]   -1.933     .700         27.000   -2.761   .010   -3.370        -.496
[gender=M]   .000(a)    .000         .        .        .      .             .
age          .713       .214         27.000   3.323    .003   .273          1.152

a. This parameter is set to zero because it is redundant.
b. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 7
The command in Example 1 produces a “Type III Tests of Fixed Effects” table (Figure 6). Both “gender” and
“age” are significant at the .05 level. This means that “gender” and “age” are potentially important
predictors of the dependent variable. More detailed information on fixed-effects parameters may be obtained
by using the subcommand /PRINT SOLUTION. The “Estimates of Fixed Effects” table (Figure 7) gives estimates of
individual parameters, as well as their standard errors and confidence intervals.

Estimates of Covariance Parameters(a)

                                                        95% Confidence Interval
Parameter   Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual    3.679      1.001        3.674    .000   2.158         6.271

a. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 8


We  can  see  that  the  mean  distance  for  males  is  larger  than  that  for  females.  Distance,  moreover,  increases  with  age.  MIXED 
also  produces  an  estimate  of  the  residual  error  variance  and  its  standard  error.  The  /PRINT  TESTCOV  option  gives  us  the  Wald 
statistic  and  the  confidence  interval  for  the  residual  error  variance  estimate. 
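As a rough check on how the Wald Z column relates to the other columns, the sketch below divides the residual variance estimate by its standard error (values from Figure 8) and converts the result to a two-sided normal p-value. The function names are hypothetical, and the use of a two-sided normal reference distribution is an assumption made for illustration rather than a statement of how SPSS computes the “Sig.” column.

```python
import math


def wald_z(estimate, std_error):
    """Wald statistic: the estimate divided by its standard error."""
    return estimate / std_error


def two_sided_normal_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Residual variance estimate and standard error as printed in Figure 8.
residual_z = wald_z(3.679, 1.001)
residual_p = two_sided_normal_p(residual_z)

print(round(residual_z, 3), residual_p)
```

The ratio differs from the printed 3.674 only in the third decimal, because the printed estimate and standard error are themselves rounded.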


Example 1 is simple; users familiar with the GLM procedure can fit the same model using GLM.

Example 2: Fixed-effects model using GLM

Command syntax:

GLM DISTANCE BY GENDER WITH AGE
/METHOD = SSTYPE(3)
/PRINT = PARAMETER
/DESIGN = GENDER AGE.


Output: 


Tests of Between-Subjects Effects

Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure

Source            Type III Sum of Squares   df   Mean Square   F        Sig.
Corrected Model   68.646(a)                 2    34.323        9.331    .001
Intercept         141.095                   1    141.095       38.356   .000
gender            28.033                    1    28.033        7.621    .010
age               40.613                    1    40.613        11.040   .003
Error             99.321                    27   3.679
Total             13372.000                 30
Corrected Total   167.967                   29

a. R Squared = .409 (Adjusted R Squared = .365)

Figure 9
We see in Figure 9 that GLM and MIXED produced the same Type III tests and parameter estimates. Note,
however, that in the MIXED “Type III Tests of Fixed Effects” table (Figure 6), there is no column for the sum
of squares. This is because, for some complex models, the test statistics in MIXED may not be expressed as a
ratio of two sums of squares. They are thus omitted from the ANOVA table.


Parameter Estimates

Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure

                                                     95% Confidence Interval
Parameter    B        Std. Error   t        Sig.   Lower Bound   Upper Bound
Intercept    17.050   2.620        6.507    .000   11.673        22.427
[gender=F]   -1.933   .700         -2.761   .010   -3.370        -.496
[gender=M]   0(a)     .            .        .      .             .
age          .713     .214         3.323    .003   .273          1.152

a. This parameter is set to zero because it is redundant.

Figure 10


With  non-iid  residual  errors 

The iid assumption ($R = \sigma^2 I$) may be violated in some situations. This often happens when repeated
measurements are made on each subject. In the growth study dataset, for example, the response variable of
each subject is measured at various ages. We may suspect that error terms within a subject are correlated. A
reasonable choice of the residual error covariance will therefore be a block diagonal matrix, where each
block is a first-order autoregressive (AR1) covariance matrix.
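To make the block diagonal structure concrete, the display below writes out one possible notation for it with three observations per subject. The symbols $R_i$, $\sigma^2$ (the common diagonal variance) and $\rho$ (the lag-one correlation) are notation assumed here for illustration; the report itself does not name them.

```latex
R =
\begin{pmatrix}
R_1 &        &     \\
    & \ddots &     \\
    &        & R_n
\end{pmatrix},
\qquad
R_i = \sigma^2
\begin{pmatrix}
1      & \rho & \rho^2 \\
\rho   & 1    & \rho   \\
\rho^2 & \rho & 1
\end{pmatrix},
\quad i = 1, \ldots, n .
```

Under this structure, the correlation between two residual errors of the same subject decays as $\rho^{|s-t|}$ with the distance between visits $s$ and $t$, while errors from different subjects remain uncorrelated.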

Example 3: Fixed-effects model with correlated residual errors

Command syntax:

MIXED DISTANCE BY GENDER WITH AGE
/FIXED GENDER AGE
/REPEATED VISIT | SUBJECT(SUBID) COVTYPE(AR1)
/PRINT SOLUTION TESTCOV R.


Output: 


Type III Tests of Fixed Effects(a)

Source      Numerator df   Denominator df   F        Sig.
Intercept   1              25.723           75.036   .000
gender      1              8.701            3.702    .088
age         1              23.687           22.772   .000

a. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 11


Estimates of Fixed Effects(b)

                                                              95% Confidence Interval
Parameter    Estimate   Std. Error   df       t        Sig.   Lower Bound   Upper Bound
Intercept    17.243     1.947        26.760   8.857    .000   13.246        21.239
[gender=F]   -2.072     1.077        8.701    -1.924   .088   -4.522        .377
[gender=M]   .000(a)    .000         .        .        .      .             .
age          .713       .149         23.687   4.772    .000   .404          1.021

a. This parameter is set to zero because it is redundant.
b. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 12
Estimates of Covariance Parameters(a)

                                                                          95% Confidence Interval
Parameter                         Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Repeated Measures  AR1 diagonal   3.809      1.467        2.597    .009   1.791         8.101
                   AR1 rho        .729       .120         6.072    .000   .401          .892

a. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 13


Example 3 uses the /REPEATED subcommand to specify a more general covariance structure for the residual
errors. Since there are three observations per subject, we assume that the set of three residual errors for
each subject is a sample from a three-dimensional normal distribution with a first-order autoregressive (AR1)
covariance matrix. Residual errors within each subject are therefore correlated, but are independent across
subjects. The MIXED procedure, by default, uses the REML method to estimate the covariance matrix. An
alternative is to request ML estimates by using the /METHOD=ML subcommand.

The  command  syntax  in  Example  3  also  produces  the  “Residual  Covariance  (R)  Matrix”  (Figure  14),  which  shows  the  estimated 
covariance  matrix  of  the  residual  error  for  one  subject.  We  see  from  the  “Estimates  of  Covariance  Parameters”  table  (Figure  13) 
that  the  correlation  parameter  has  a  relatively  large  value  (.729)  and  that  the  p-value  of  the  Wald  test  is  less  than  .05.  The 
autoregressive  structure  may  fit  the  data  better  than  the  model  in  Example  1. 
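The estimated R matrix can be approximately rebuilt from the two covariance parameters in Figure 13. The sketch below is a minimal illustration: the function name `ar1_covariance` is hypothetical, and it assumes the usual AR1 form in which the covariance between visits s and t is the diagonal variance times rho raised to |s - t|.

```python
def ar1_covariance(variance, rho, size):
    """Build a size-by-size AR1 covariance matrix as a list of rows."""
    return [
        [variance * rho ** abs(row - col) for col in range(size)]
        for row in range(size)
    ]


# "AR1 diagonal" and "AR1 rho" estimates as printed in Figure 13.
r_block = ar1_covariance(3.809, 0.729, 3)

for row in r_block:
    print([round(value, 3) for value in row])
```

The off-diagonal entries agree with the printed matrix (2.778 and 2.026) to about two decimal places; the small differences arise because the printed parameter estimates are rounded.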

We  also  see  that,  for  the  tests  of  fixed  effects,  the  denominator  degrees  of  freedom  are  not  integers.  This  is  because  these 
statistics  do  not  have  exact  F  distributions.  The  values  for  denominator  degrees  of  freedom  are  obtained  by  a  Satterthwaite 
approximation.  We  see  in  the  new  model  that  gender  is  not  significant  at  the  .05  level.  This  demonstrates  that  ignoring  the 
possible  correlations  in  your  data  may  lead  to  incorrect  conclusions.  MIXED  is  therefore  usually  a  better  alternative  to  GLM 
and  VARCOMP  when  data  are  correlated. 

Fitting simple mixed-effects models
Balanced design

MIXED, as its name implies, handles complicated models that involve fixed and random effects. Levels of an
effect are, in some situations, only a sample of all possible levels. If we want to study the efficiency of
workers in different environments, for example, we don’t need to include all workers in the study; a sample
of workers is usually enough. The worker effect should be considered random, due to the sampling process. A
mixed-effects model has, in general, the form $y = X\beta + Z\gamma + \varepsilon$, where the extra term
$Z\gamma$ models the random effects. $Z$ is the design matrix of random effects and $\gamma$ is a vector of
random-effects parameters. We can use GLM and MIXED to fit mixed-effects models. MIXED, however, fits a much
wider class of models. To understand the functionality of MIXED, we first look at several simpler models that
can be created in MIXED and GLM. We also look at the similarity between MIXED and VARCOMP in these models.


Residual Covariance (R) Matrix(a)

              [visit = 1]   [visit = 2]   [visit = 3]
[visit = 1]   3.809         2.778         2.026
[visit = 2]   2.778         3.809         2.778
[visit = 3]   2.026         2.778         3.809

First-Order Autoregressive
a. Dependent Variable: Distance (mm) from center of pituitary to pteryo-maxillary fissure.

Figure 14
In examples 4 through 6, we use a semiconductor dataset that appeared in Pinheiro and Bates (2000) to
illustrate the similarity between GLM, MIXED, and VARCOMP. The dependent variable in this dataset is
“current” and the predictor is “voltage.” The data are collected from a sample of ten silicon wafers. There
are eight sites on each wafer and five measurements are taken at each site. We have, therefore, a total of
400 observations and a balanced design.

Example 4: Simple mixed-effects model with balanced design using MIXED

Command  syntax: 

MIXED  CURRENT  BY  WAFER  WITH  VOLTAGE 
/FIXED  VOLTAGE  |  SSTYPE(3) 

/RANDOM  WAFER 
/PRINT  SOLUTION  TESTCOV. 

Output: 


Type III Tests of Fixed Effects(a)

Source      Numerator df   Denominator df   F           Sig.
Intercept   1              16.559           3774.499    .000
voltage     1              389.000          67958.177   .000

a. Dependent Variable: current.

Figure 15


Estimates of Fixed Effects(a)

                                                               95% Confidence Interval
Parameter   Estimate    Std. Error   df        t         Sig.   Lower Bound   Upper Bound
Intercept   -7.082868   .115287      16.559    -61.437   .000   -7.326596     -6.839139
voltage     9.648660    .037012      389.000   260.688   .000   9.575890      9.721429

a. Dependent Variable: current.

Figure 16


Estimates of Covariance Parameters(a)

                                                               95% Confidence Interval
Parameter        Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual         .175       .013         13.946   .000   .152          .202
wafer Variance   .093       .046         2.026    .043   .036          .246

a. Dependent Variable: current.

Figure 17
Example 5: Simple mixed-effects model with balanced design using GLM

Command syntax:

GLM CURRENT BY WAFER WITH VOLTAGE
/RANDOM = WAFER
/METHOD = SSTYPE(3)
/PRINT = PARAMETER
/DESIGN = WAFER VOLTAGE.


Output: 


Tests of Between-Subjects Effects

Dependent Variable: current

Source                   Type III Sum of Squares   df      Mean Square   F           Sig.
Intercept   Hypothesis   2229.645                  1       2229.645      3774.499    .000
            Error        9.782                     16.56   .591(a)
wafer       Hypothesis   35.223                    9       3.914         22.319      .000
            Error        68.211                    389     .175(b)
voltage     Hypothesis   11916.369                 1       11916.369     67958.177   .000
            Error        68.211                    389     .175(b)

a. .111 MS(wafer) + .889 MS(Error)
b. MS(Error)

Figure 18


Expected Mean Squares(a,b)

            Variance Component
Source      Var(wafer)   Var(Error)   Quadratic Term
Intercept   4.444        1.000        Intercept
wafer       40.000       1.000
voltage     .000         1.000        voltage
Error       .000         1.000

a. For each source, the expected mean square equals the sum of the coefficients in the cells times the
   variance components, plus a quadratic term involving effects in the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.

Figure 19


Example  6:  Variance  components  model  with  balanced  design 
Command  syntax: 

VARCOMP  CURRENT  BY  WAFER  WITH  VOLTAGE 
/RANDOM  =  WAFER 
/METHOD  =  REML. 

Output: 


Variance Estimates

Component    Estimate
Var(wafer)   .093
Var(Error)   .175

Dependent Variable: current
Method: Restricted Maximum Likelihood Estimation

Figure 20


In  Example  4,  “voltage”  is  entered  as  a  fixed  effect  and  “wafer”  is 
entered  as  a  random  effect.  This  example  tries  to  model  the  relationship 
between  “current”  and  “voltage”  using  a  straight  line,  but  the  intercept 
of  the  regression  line  will  vary  from  wafer  to  wafer  according  to  a  normal 
distribution.  In  the  Type  III  tests  for  “voltage,”  we  see  a  significant 
relationship  between  “current”  and  “voltage.”  If  we  delve  deeper  into  the 
parameter  estimates  table,  the  regression  coefficient  of  “voltage”  is  9.65. 
This  indicates  a  positive  relationship  between  “current”  and  “voltage.” 

In  the  “Estimates  of  Covariance  Parameters”  table  (Figure  17),  we  have 
estimates  for  the  residual  error  variance  and  the  variance  due  to  the 
sampling  of  wafers. 
We repeat the same model in Example 5 using GLM. Note that MIXED produces Type III tests for fixed effects
only, but GLM includes fixed and random effects. GLM treats all effects as fixed during computation and
constructs F statistics by taking the ratio of the appropriate sums of squares. Mean squares of random
effects in GLM are estimates of functions of the variance parameters of random and residual effects. These
functions can be recovered from “Expected Mean Squares” (Figure 19). In MIXED, the outputs are much simpler
because the variance parameters are estimated directly using ML or REML. As a result, there are no
random-effect sums of squares.

When we have a balanced design, as in examples 4 through 6, the tests of fixed effects are the same for GLM
and MIXED. We can also recover the variance parameter estimates of MIXED by using the sum of squares in GLM.
In MIXED, for example, the estimate of the residual variance is 0.175, which is the same as the MS(Error) in
GLM. The variance estimate of random effect “wafer” is 0.093, which can be recovered in GLM using the
“Expected Mean Squares” table (Figure 19) in Example 5:

Var(WAFER) = [MS(WAFER) - MS(Error)]/40 = (3.914 - 0.175)/40 = 0.093

This is equal to MIXED’s estimate. One drawback of GLM, however, is that you cannot compute the standard
error of the variance estimates.
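The recovery of the “wafer” variance from the GLM mean squares can be reproduced with a few lines of Python. This is only a sketch of the arithmetic described above; the function name `moment_variance_estimate` is hypothetical, and the inputs are the rounded values printed in Figures 18 and 19.

```python
def moment_variance_estimate(ms_effect, ms_error, coefficient):
    """Solve E[MS(effect)] = coefficient * Var(effect) + Var(Error) for Var(effect)."""
    return (ms_effect - ms_error) / coefficient


# MS(wafer) and MS(Error) from Figure 18; the coefficient 40 is the
# Var(wafer) entry in the "wafer" row of Figure 19.
var_wafer = moment_variance_estimate(3.914, 0.175, 40.0)

print(round(var_wafer, 3))
```

With the balanced data, this moment-based value rounds to the same 0.093 that MIXED reports in Figure 17.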

VARCOMP is, in fact, a subset of MIXED. These two procedures therefore always provide the same variance
estimates, as seen in examples 4 and 6. VARCOMP only fits relatively simple models. It can only handle random
effects that are iid. No statistics on fixed effects are produced. If your primary objective is to make
inferences about fixed effects and your data are correlated, MIXED is a better choice.

An  important  note:  Due  to  the  different  estimation  methods  that  are  used,  GLM  and  MIXED  often  do  not  produce  the  same 
results.  The  next  section  gives  an  example  of  situations  in  which  they  produce  different  results. 

Unbalanced  design 

One  situation  in  which  MIXED  and  GLM  disagree  is  with  an  unbalanced  design.  To  illustrate  this,  we  removed  some  cases  in 
the  semiconductor  dataset,  so  that  the  design  is  no  longer  balanced. 
We then rerun examples 4 through 6 with this unbalanced dataset. The output is shown in examples 4a through
6a. We want to see whether the three methods (GLM, MIXED and VARCOMP) still agree with each other.

Example  4a:  Mixed-effects  model  with  unbalanced  design  using  MIXED 

Command  syntax: 

MIXED  CURRENT  BY  WAFER  WITH  VOLTAGE 
/FIXED  VOLTAGE  |  SSTYPE(3) 

/RANDOM  WAFER 
/PRINT  SOLUTION  TESTCOV. 


Output: 


Type III Tests of Fixed Effects(a)

Source      Numerator df   Denominator df   F           Sig.
Intercept   1              16.495           3709.960    .000
voltage     1              385.037          67481.118   .000

a. Dependent Variable: current.

Figure 22


Estimates of Fixed Effects(a)

                                                              95% Confidence Interval
Parameter   Estimate   Std. Error   df        t         Sig.   Lower Bound   Upper Bound
Intercept   -7.098     .117         16.495    -60.909   .000   -7.344        -7.093
voltage     9.656      .037         385.037   259.771   .000   9.583         9.730

a. Dependent Variable: current.

Figure 23


Estimates of Covariance Parameters(a)

                                                               95% Confidence Interval
Parameter        Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual         .174451    .013         13.874   .000   .151          .201
wafer Variance   .095725    .047         2.027    .043   .036          .252

a. Dependent Variable: current.

Figure 24


Example  5a:  Mixed-effects  model  with  unbalanced  design  using  GLM 


Command  syntax: 


GLM  CURRENT  BY  WAFER  WITH  VOLTAGE 
/RANDOM  =  WAFER 
/METHOD = SSTYPE(3)

/PRINT  =  PARAMETER 
/DESIGN  =  WAFER  VOLTAGE. 
Output:

Tests of Between-Subjects Effects

Dependent Variable: current

Source                   Type III Sum of Squares   df       Mean Square   F           Sig.
Intercept   Hypothesis   2193.281                  1        2193.281      3724.816    .000
            Error        9.746                     16.551   .589(a)
wafer       Hypothesis   35.495                    9        3.944         22.607      .000
            Error        67.163                    385      .174(b)
voltage     Hypothesis   11772.307                 1        11772.307     67482.629   .000
            Error        67.163                    385      .174(b)

a. .110 MS(wafer) + .890 MS(Error)
b. MS(Error)

Figure 25


Expected Mean Squares(a,b)

            Variance Component
Source      Var(wafer)   Var(Error)   Quadratic Term
Intercept   4.352        1.000        Intercept
wafer       39.591       1.000
voltage     .000         1.000        voltage
Error       .000         1.000

a. For each source, the expected mean square equals the sum of the coefficients in the cells times the
   variance components, plus a quadratic term involving effects in the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.

Figure 26


Example  6a:  Variance  components  model  with  unbalanced  design 


Command  syntax: 


VARCOMP CURRENT BY WAFER WITH VOLTAGE
/RANDOM = WAFER
/METHOD = REML.


Output: 


Variance Estimates

Component    Estimate
Var(wafer)   .0957247
Var(Error)   .1744505

Dependent Variable: current
Method: Restricted Maximum Likelihood Estimation

Figure 27


Since  the  data  have  changed,  we  expect  examples  4a  through  6a  to  differ 
from  examples  4  through  6.  We  will  focus  instead  on  whether  examples  4a, 
5a,  and  6a  agree  with  each  other. 

In  Example  4a,  the  F  statistic  for  the  “voltage”  effect  is  67481.118,  but 
Example  5a  gives  an  F  statistic  value  of  67482.629.  Apart  from  the  test  of 
fixed  effects,  we  also  see  a  difference  in  covariance  parameter  estimates. 
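The size of the difference in the “wafer” variance can be worked out from the printed output. The calculation below is a sketch that assumes the method-of-moments estimate is obtained from Figures 25 and 26 in the same way as in the balanced case, and it uses the rounded mean squares shown there; the subscripts GLM and MIXED are labels introduced here for illustration.

```latex
\widehat{\mathrm{Var}}(\text{wafer})_{\mathrm{GLM}}
  = \frac{MS(\text{wafer}) - MS(\text{Error})}{39.591}
  = \frac{3.944 - 0.174}{39.591}
  \approx 0.0952,
\qquad
\widehat{\mathrm{Var}}(\text{wafer})_{\mathrm{MIXED}} = 0.0957 .
```

The two estimates now differ in the third decimal place, whereas with the balanced data they agreed.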


Examples  4a  and  6a,  however,  show  that  VARCOMP  and  MIXED  can  produce  the  same  variance  estimates,  even  in  an 
unbalanced  design.  This  is  because  MIXED  and  VARCOMP  offer  maximum  likelihood  or  restricted  maximum  likelihood 
methods  in  estimation,  while  GLM  estimates  are  based  on  the  method-of-moments  approach. 


MIXED is generally preferred because it is asymptotically efficient (minimum variance), whether or not the
data are balanced. GLM, however, only achieves its optimum behavior when the data are balanced.
Fitting mixed-effects models


With  subjects 

In the semiconductor dataset, “current” is a dependent variable measured on a batch of wafers. These wafers
are therefore considered subjects in a study. An effect of interest (such as “site”) may often vary with
subjects (“wafer”). One scenario is that the (population) means of “current” at separate sites are different.
When we look at the current measured at these sites on individual wafers, however, they hover below or above
the population mean according to some normal distribution. It is therefore common to enter an “effect by
subject” interaction term in a GLM or MIXED model to account for the subject variations.

In the dataset there are eight sites and ten wafers. The site*wafer effect, therefore, has 80 parameters,
which can be denoted by $\gamma_{ij}$, $i = 1, \ldots, 10$ and $j = 1, \ldots, 8$. A common assumption is
that the $\gamma_{ij}$'s are iid normal with zero mean and an unknown variance. The mean is zero because the
$\gamma_{ij}$'s are used to model only the population variation. The mean of the population is modeled by
entering “site” as a fixed effect in GLM and MIXED. The results of this model for MIXED and GLM are shown in
examples 7 and 8.
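Written out for a single observation, the model described above might look as follows. This is a sketch in notation assumed here for illustration: $k$ indexes the five measurements at a site, $\mu$ is the intercept, $\alpha_j$ is the fixed “site” effect, $\beta$ is the “voltage” slope, and $\sigma^2_\gamma$ and $\sigma^2$ are the two unknown variances. Only $\gamma_{ij}$ is notation taken from the text.

```latex
y_{ijk} = \mu + \alpha_j + \beta\,\mathrm{voltage}_{ijk} + \gamma_{ij} + \varepsilon_{ijk},
\qquad
\gamma_{ij} \overset{\text{iid}}{\sim} N(0, \sigma^2_\gamma),
\quad
\varepsilon_{ijk} \overset{\text{iid}}{\sim} N(0, \sigma^2),
```

with $i = 1, \ldots, 10$ (wafers), $j = 1, \ldots, 8$ (sites) and $k = 1, \ldots, 5$ (measurements). The fixed part $\mu + \alpha_j + \beta\,\mathrm{voltage}_{ijk}$ describes the population mean, while $\gamma_{ij}$ lets each site on each wafer deviate from it.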

Example 7: Fitting random effect*subject interaction using MIXED
Command  syntax: 

MIXED  CURRENT  BY  WAFER  SITE  WITH  VOLTAGE 
/FIXED  SITE  VOLTAGE  |SSTYPE(3) 

/RANDOM  SITE*WAFER  |  COVTYPE(ID). 

Output: 


Type III Tests of Fixed Effects(a)

Source      Numerator df   Denominator df   F           Sig.
Intercept   1              329.796          10467.974   .000
site        7              72.000           1.140       .348
voltage     1              319.000          76639.444   .000

a. Dependent Variable: current.

Figure 28


Estimates of Covariance Parameters(a)

                                                                      95% Confidence Interval
Parameter               Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                .155       .012         12.629   .000   .133          .182
site * wafer Variance   .104       .023         4.586    .000   .068          .159

a. Dependent Variable: current.

Figure 29
Example 8: Fitting random effect*subject interaction using GLM

Command syntax:

GLM CURRENT BY WAFER SITE WITH VOLTAGE
/RANDOM = WAFER
/METHOD = SSTYPE(3)
/DESIGN = SITE SITE*WAFER VOLTAGE.


Output: 


Tests of Between-Subjects Effects

Source                      Type III Sum of Squares   df        Mean Square   F           Sig.
Intercept      Hypothesis   2229.645                  1         2229.645      10467.974   .000
               Error        70.246                    329.796   .213(a)
site           Hypothesis   5.371                     7         .767          1.140       .348
               Error        48.462                    72        .673(b)
wafer * site   Hypothesis   48.462                    72        .673          4.329       .000
               Error        49.600                    319       .155(c)
voltage        Hypothesis   11916.369                 1         11916.369     76639.444   .000
               Error        49.600                    319       .155(c)

a. .111 MS(wafer * site) + .889 MS(Error)
b. MS(wafer * site)
c. MS(Error)

Figure 30


Expected Mean Squares (a, b)

               Variance Component
Source         Var(wafer * site)   Var(Error)   Quadratic Term
Intercept      .556                1.000        Intercept, site
site           5.000               1.000        site
wafer * site   5.000               1.000
voltage        .000                1.000        voltage
Error          .000                1.000

a. For each source, the expected mean square equals the sum of the coefficients in the cells
   times the variance components, plus a quadratic term involving effects in the Quadratic Term cell.
b. Expected Mean Squares are based on the Type III Sums of Squares.


Figure  31 


Since the design is balanced, the results of GLM and MIXED in examples 7 and 8 match. This is similar to examples 4 and 5.
We see from the results of the Type III tests that “voltage” is still an important predictor of “current,” while “site” is not. The mean
currents at different sites are thus not significantly different from each other, so we can use a simpler model without the fixed
effect “site.” We should still, however, consider a random-effects model, because ignoring the subject variation may lead to
incorrect standard error estimates for the fixed effects or to falsely significant tests.
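
The agreement between the two procedures can be checked by hand from the GLM output. The following Python sketch (not part of SPSS; all variable names are illustrative) equates the mean squares in Figure 30 to their expectations in Figure 31 and solves for the variance components. It also rebuilds the error term for the intercept, using the coefficient .111 = .556/5 implied by Figure 31, and assumes that the Satterthwaite approximation is what produces the fractional denominator degrees of freedom.

```python
# Mean squares and degrees of freedom copied from the GLM output (Figure 30).
ms_wafer_site, df_wafer_site = 0.673, 72
ms_error, df_error = 0.155, 319

# Figure 31: E[MS(wafer*site)] = 5 * Var(wafer*site) + Var(Error)
#            E[MS(Error)]      = Var(Error)
var_error = ms_error
var_wafer_site = (ms_wafer_site - ms_error) / 5.0

# Error term for the intercept: .111 MS(wafer*site) + .889 MS(Error).
part_a = 0.111 * ms_wafer_site
part_b = 0.889 * ms_error
ms_intercept_error = part_a + part_b

# Satterthwaite approximation (an assumption of this sketch) for the
# degrees of freedom of that linear combination of mean squares.
df_intercept_error = ms_intercept_error ** 2 / (
    part_a ** 2 / df_wafer_site + part_b ** 2 / df_error
)

print(round(var_error, 3), round(var_wafer_site, 3))
print(round(ms_intercept_error, 3), round(df_intercept_error, 1))
```

With these rounded inputs, the variance components come out at about .155 and .104, the values MIXED reports in Figure 29. The synthesized error mean square and its degrees of freedom land close to the .213 and 329.796 shown by GLM; the small gap comes from using three-decimal inputs.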

Up  to  this  point,  we  examined  primarily  the  similarities  between  GLM  and  MIXED.  MIXED,  in  fact,  has  a  much  more  flexible  way 
of  modeling  random  effects.  Using  the  SUBJECT  and  COVTYPE  options,  Example  9  presents  an  equivalent  form  of  Example  7. 

Example 9: Fitting random effect*subject interaction using SUBJECT specification

Command syntax:

MIXED CURRENT BY SITE WITH VOLTAGE
/FIXED SITE VOLTAGE | SSTYPE(3)
/RANDOM SITE | SUBJECT(WAFER) COVTYPE(ID).

The SUBJECT option tells MIXED that each subject will have its own set of random parameters for the random effect “site.”
The COVTYPE option specifies the form of the variance-covariance matrix of the random parameters within one subject.
The command syntax specifies the distributional assumption in a multivariate form, which can be written as:

\[ (\gamma_{i1}, \gamma_{i2}, \ldots, \gamma_{i8})^{T} \overset{iid}{\sim} N(\mathbf{0}, \sigma^{2} I_{8}), \qquad i = 1, \ldots, 10, \]

where \gamma_{ij} is the random parameter for site j on wafer i and I_{8} is the eight-by-eight identity matrix.


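Under normality, this assumption is equivalent to that in Example 7. One advantage of the multivariate form is that you can easily
specify other covariance structures by using the COVTYPE option. The flexibility in specifying covariance structures helps us to
fit a model that better describes the data. If, for example, we believe that the variances of different sites are different, we can
specify a diagonal matrix as covariance type and the assumption becomes:

\[ (\gamma_{i1}, \gamma_{i2}, \ldots, \gamma_{i8})^{T} \overset{iid}{\sim} N(\mathbf{0}, \mathrm{diag}(\sigma_{1}^{2}, \ldots, \sigma_{8}^{2})), \qquad i = 1, \ldots, 10, \]

where \sigma_{j}^{2} is the variance of the random parameter for site j.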


The  result  of  fitting  the  same  model  using  this  assumption  is  given  in  Example  10. 
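
To make the difference between the two covariance types concrete, the sketch below builds both matrices in plain Python. The helper names are hypothetical, and the numbers are the estimates reported in Figure 29 (scaled identity) and Figure 35 (diagonal); this only illustrates the shape of each structure, not how MIXED estimates it.

```python
def scaled_identity(variance, size):
    """COVTYPE(ID): one common variance on the diagonal, zeros elsewhere."""
    return [[variance if r == c else 0.0 for c in range(size)] for r in range(size)]


def diagonal(variances):
    """COVTYPE(DIAG): a separate variance for each level, zeros elsewhere."""
    n = len(variances)
    return [[variances[r] if r == c else 0.0 for c in range(n)] for r in range(n)]


# One covariance parameter shared by all eight sites (estimate from Figure 29).
g_id = scaled_identity(0.104, 8)

# Eight covariance parameters, one per site (estimates from Figure 35).
g_diag = diagonal([0.136, 0.096, 0.183, 0.119, 0.071, 0.073, 0.030, 0.120])

for row in g_diag:
    print(" ".join(f"{value:.3f}" for value in row))
```

The scaled identity has a single free parameter, while the diagonal structure has eight, so the two models differ by seven covariance parameters.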


Example  10:  Using  COVTYPE  in  a  random-effects  model 


Command  syntax: 


MIXED CURRENT BY SITE WITH VOLTAGE
/FIXED SITE VOLTAGE | SSTYPE(3)
/RANDOM SITE | SUBJECT(WAFER) COVTYPE(DIAG)
/PRINT G TESTCOV.


Output: 


Type III Tests of Fixed Effects (a)

Source      Numerator df   Denominator df   F           Sig.
Intercept   1              31.313           10467.974   .000
site        7              16.310           1.267       .325
voltage     1              319.000          6639.444    .000

a. Dependent Variable: current.


Figure 34

Estimates of Covariance Parameters (a)

                                                                         95% Confidence Interval
Parameter                            Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                             .155       .012         12.629   .000   .133          .182
site [subject = wafer]   Var: [site=1]   .136   .079         1.726    .084   .044          .424
                         Var: [site=2]   .096   .060         1.599    .110   .028          .326
                         Var: [site=3]   .183   .101         1.812    .070   .062          .539
                         Var: [site=4]   .119   .071         1.681    .093   .037          .382
                         Var: [site=5]   .071   .048         1.475    .140   .019          .269
                         Var: [site=6]   .073   .049         1.484    .138   .019          .272
                         Var: [site=7]   .030   .029         1.046    .296   .005          .198
                         Var: [site=8]   .120   .071         1.685    .092   .038          .385

a. Dependent Variable: current.


Figure  35 


In Example 10, we request one extra table, the estimated covariance matrix of the random effect “site.” It is an
eight-by-eight diagonal matrix in this case. Note that changing the covariance structure of a random effect also
changes the estimates and tests of fixed effects. We want, in practice, an objective method to select suitable
covariance structures for our random effects. In the section “Covariance Structure Selection,” we revisit
examples 9 and 10 to show how to select covariance structures for random effects.


Random Effect Covariance Structure (G) (a)

                     Columns: [site=j] | wafer
                     site=1   site=2   site=3   site=4   site=5   site=6   site=7   site=8
[site=1] | wafer     .136     .000     .000     .000     .000     .000     .000     .000
[site=2] | wafer     .000     .096     .000     .000     .000     .000     .000     .000
[site=3] | wafer     .000     .000     .183     .000     .000     .000     .000     .000
[site=4] | wafer     .000     .000     .000     .119     .000     .000     .000     .000
[site=5] | wafer     .000     .000     .000     .000     .071     .000     .000     .000
[site=6] | wafer     .000     .000     .000     .000     .000     .073     .000     .000
[site=7] | wafer     .000     .000     .000     .000     .000     .000     .030     .000
[site=8] | wafer     .000     .000     .000     .000     .000     .000     .000     .120

a. Dependent Variable: current.


Figure  36 


Multilevel analysis

The use of the SUBJECT and COVTYPE options in /RANDOM and /REPEATED brings many options for modeling the
covariance structures of random effects and residual errors. It is particularly useful when modeling data
obtained from a hierarchy. Example 11 illustrates the simultaneous use of these options in a multilevel model.
We selected data from six schools from the Junior School Project of Mortimore, et al. (1988). We investigate
below how the socioeconomic status (SES) of a student affects his or her math scores over a three-year period.


Example 11: Multilevel mixed-effects model


Command  syntax: 


MIXED MATHTEST BY SCHOOL CLASS STUDENT GENDER SES SCHLYEAR
/FIXED GENDER SES SCHLYEAR SCHOOL
/RANDOM SES | SUBJECT(SCHOOL*CLASS) COVTYPE(ID)
/RANDOM SES | SUBJECT(SCHOOL*CLASS*STUDENT) COVTYPE(ID)
/REPEATED SCHLYEAR | SUBJECT(SCHOOL*CLASS*STUDENT) COVTYPE(AR1)
/PRINT SOLUTION TESTCOV.


Output: 


Type III Tests of Fixed Effects (a)

Source      Numerator df   Denominator df   F          Sig.
Intercept   1              15.332           1076.489   .000
gender      1              134.839          .979       .324
ses         2              16.815           3.888      .041
schlyear    2              202.538          55.376     .000
school      5              13.120           .872       .525

a. Dependent Variable: Math test.


Figure 37

Estimates of Fixed Effects (b)

                                                                      95% Confidence Interval
Parameter      Estimate   Std. Error   df        t        Sig.   Lower Bound   Upper Bound
Intercept      29.097     2.184        20.036    13.324   .000   24.542        33.652
[gender=0]     -1.026     1.037        134.839   -.989    .324   -3.077        1.025
[gender=1]     .000 (a)   .000         .         .        .      .             .
[ses=1.00]     5.803      2.331        21.478    2.490    .021   .963          10.644
[ses=2.00]     .304       1.782        13.877    .170     .867   -3.522        4.129
[ses=3.00]     .000 (a)   .000         .         .        .      .             .
[schlyear=0]   -4.377     .457         118.116   -9.575   .000   -5.282        -3.471
[schlyear=1]   -4.126     .468         119.557   -8.825   .000   -5.047        -3.204
[schlyear=2]   .000 (a)   .000         .         .        .      .             .
[school=1]     -2.751     2.405        12.873    -1.144   .274   -7.952        2.450
[school=2]     -.784      2.865        18.557    -.274    .787   -6.792        5.223
[school=3]     2.269      2.645        14.518    .858     .405   -3.385        7.923
[school=4]     -1.911     2.811        9.329     -.680    .513   -8.236        4.415
[school=5]     -.686      2.545        15.575    -.270    .791   -6.092        4.720
[school=6]     .000 (a)   .000         .         .        .      .             .

a. This parameter is set to zero because it is redundant.
b. Dependent Variable: Math test.


Figure  38 


Estimates of Covariance Parameters (a)

                                                                                             95% Confidence Interval
Parameter                                                 Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Repeated Measures                          AR1 diagonal   12.686     1.667        7.609    .000   9.805         16.413
                                           AR1 rho        -.027      .142         -.190    .850   -.296         .246
ses [subject = school * class]             Variance       6.450      4.991        1.292    .196   1.415         29.391
ses [subject = school * class * student]   Variance       30.409     4.782        6.358    .000   22.342        41.387

a. Dependent Variable: Math test.


Figure  39 


In  Example  11,  the  goal  is  to  discover  whether  socioeconomic  status  (“ses”)  is  an  important  predictor  for  mathematics 
achievement  (“mathtest”).  To  do  so,  we  use  the  factor  “ses”  as  a  fixed  effect.  We  also  want  to  adjust  for  the  possible 
sampling  variation  due  to  different  classes  and  students.  “Ses”  is  therefore  also  used  twice  as  a  random  effect.  The  first 
random  effect  tries  to  adjust  for  the  variation  of  the  “ses”  effect  owing  to  class  variation.  In  order  to  identify  all  classes  in 
the  dataset,  school*class  is  specified  in  the  SUBJECT  option.  The  second  random  effect  also  tries  to  adjust  for  the  variation 
of  the  “ses”  effect  owing  to  student  variation.  The  subject  specification  is  thus  school*class*student.  All  of  the  students  are 
followed  for  three  years;  the  school  year  (“schlyear”)  is  therefore  used  as  a  fixed  effect  to  adjust  for  possible  trends  in  this 
period.  The  /REPEATED  subcommand  is  also  used  to  model  the  possible  correlation  of  the  residual  errors  within  each  student. 
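
As an illustration of what COVTYPE(AR1) on /REPEATED implies, the following Python sketch builds the first-order autoregressive covariance matrix of the three residuals within one student. The function name is hypothetical; the variance and correlation are the “AR1 diagonal” and “AR1 rho” estimates read from Figure 39.

```python
def ar1_covariance(variance, rho, n_times):
    """AR(1) structure: Cov(e_s, e_t) = variance * rho ** |s - t|."""
    return [
        [variance * rho ** abs(s - t) for t in range(n_times)]
        for s in range(n_times)
    ]


# Three school years per student; estimates taken from Figure 39.
r_matrix = ar1_covariance(variance=12.686, rho=-0.027, n_times=3)

for row in r_matrix:
    print(" ".join(f"{value:8.3f}" for value in row))
```

Because the estimated correlation is so close to zero, the off-diagonal entries are tiny compared with the diagonal, which is the numerical picture behind preferring a simpler residual structure.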

We  have  a  relatively  small  dataset.  Since  there  are  only  six  schools,  we  can  only  use  school  as  a  fixed  effect  while  adjusting 
for  possible  differences  between  schools.  In  this  example,  there  is  only  one  random  effect  at  each  level.  With  SPSS  11.5  or 
later,  you  can  specify  more  than  one  random  effect  in  MIXED.  If  multiple  random  effects  are  specified  on  the  same  RANDOM 
subcommand,  you  can  model  their  correlation  by  using  a  suitable  COVTYPE  specification.  If  the  random  effects  are  specified 
on separate RANDOM subcommands, they are assumed to be independent.

In the Type III tests of fixed effects in Example 11, we see that socioeconomic status does affect student performance. The
parameter estimates of “ses” for students with “ses=1” (fathers have managerial or professional occupations) indicate that
these students perform better than students at other socioeconomic levels. The effect “schlyear” is also significant in the
model, and the students’ performance increases with “schlyear.”

From “Estimates of Covariance Parameters” (Figure 39), we notice that the estimate of the “AR1 rho” parameter is not
significant, which means that a simple, scaled-identity structure may be used. For the variation of “ses” due to
school*class, the estimate is very small compared to other sources of variance and the Wald test indicates that it is not
significant. We can therefore consider removing the random effect from the model.
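
The Wald statistics referred to here are simply the estimate divided by its standard error. The sketch below recomputes two of them from Figure 39 in Python, assuming (this is not stated in the text) that the reported significance is a two-sided p-value from the standard normal distribution; the function name is illustrative.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald Z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# "AR1 rho" and the school*class variance of "ses", from Figure 39.
z_rho, p_rho = wald_test(-0.027, 0.142)
z_class, p_class = wald_test(6.450, 4.991)

print(f"AR1 rho:      Z = {z_rho:.3f}, p = {p_rho:.3f}")
print(f"school*class: Z = {z_class:.3f}, p = {p_class:.3f}")
```

Both p-values are well above 0.05, which is the basis for considering a simpler residual structure and for dropping the class-level random effect.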

We  see  from  this  example  that  the  major  advantages  of  MIXED  are  that  it  is  able  to  look  at  different  aspects  of  a  dataset 
simultaneously  and  that  all  of  the  statistics  are  already  adjusted  for  all  effects  in  the  model.  Without  MIXED,  we  must  use 
different  tools  to  study  different  aspects  of  the  models.  An  example  of  this  is  using  GLM  to  study  the  fixed  effects  and 
using  VARCOMP  to  study  the  covariance  structure.  This  is  not  only  time  consuming,  but  the  assumptions  behind  the  statistics 
are  usually  violated. 

Custom  hypothesis  tests 

Apart from predefined statistics, MIXED allows users to construct custom hypotheses on fixed- and random-effects parameters
through the use of the /TEST subcommand. To illustrate, we use a dataset from Pinheiro and Bates (2000). The data consist
of a CT scan on a sample of ten dogs. The dogs’ left and right lymph nodes were scanned and the intensity of each scan was
recorded in the variable pixel. The following mixed-model command syntax tests whether there is a difference between the left
and right lymph nodes.

Example 12: Custom hypothesis testing in mixed-effects model

Command syntax:

MIXED PIXEL BY SIDE
/FIXED SIDE
/RANDOM SIDE | SUBJECT(DOG) COVTYPE(UN)
/TEST(0) 'Side (fixed)' SIDE 1 -1
/TEST(0) 'Side (random)' SIDE 1 -1 | SIDE 1 -1
/PRINT LMATRIX.

Output: 


Contrast Coefficients (a)

                                    L1
Fixed Effects     Intercept         0
                  [side=L]          1
                  [side=R]          -1
Random Effects    [side=L] | dog    0
                  [side=R] | dog    0

a. Side (fixed)


Figure 40

Contrast Estimates (a, b)

                                                                        95% Confidence Interval
Contrast   Estimate   Std. Error   df      Test Value   t       Sig.    Lower Bound   Upper Bound
L1         8.502      7.337        7.898   .000         1.159   .280    -8.454        25.458

a. Side (fixed)
b. Dependent Variable: pixel.


Figure  41 


Contrast Estimates (a, b)

                                                                         95% Confidence Interval
Contrast   Estimate   Std. Error   df       Test Value   t       Sig.    Lower Bound   Upper Bound
L1         8.502      3.205        86.681   .000         2.653   .009    2.131         14.873

a. Side (random)
b. Dependent Variable: pixel.


Figure  42 


The output of the two /TEST subcommands is shown above. The first test looks at differences between the left and right sides in the
general population (broad inference space). We should use the second test to test the differences between the left and right
sides for the sample of dogs used in this particular study (narrow inference space). In the second test, the average differences
of the random effects over the ten dogs are added to the statistics. MIXED automatically calculates the average over subjects.
Note that the contrast coefficients for random effects are scaled by 1/(number of subjects). Though the average difference
for the random effect is zero, it affects the standard error of the statistic. We see that the estimates of the two tests are the same,
but the second has a smaller standard error. This means that if we make an inference on a larger population, there will be
more uncertainty, which is reflected in the larger standard error of the first test. The hypothesis in this example is not significant in
the general population, but it is significant for the narrow inference. A larger sample size is therefore often needed to test a
hypothesis about the general population.
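
The effect of the inference space on the test can be reproduced from the two Contrast Estimates tables. This short Python sketch (function name illustrative) divides the same contrast estimate by the two reported standard errors; the p-values themselves require a t distribution with the reported degrees of freedom and are not recomputed here.

```python
def t_statistic(estimate, std_error, test_value=0.0):
    """t statistic for the hypothesis that the contrast equals test_value."""
    return (estimate - test_value) / std_error


# Same contrast estimate, different standard errors (Figures 41 and 42).
t_broad = t_statistic(8.502, 7.337)    # 'Side (fixed)': broad inference space
t_narrow = t_statistic(8.502, 3.205)   # 'Side (random)': narrow inference space

print(f"broad:  t = {t_broad:.3f}")
print(f"narrow: t = {t_narrow:.3f}")
```

The estimate is identical in both tests; only the denominator changes, which is why the narrow-space test reaches significance while the broad-space test does not.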

Covariance  structure  selection 

In examples 3 and 11, we see the use of Wald statistics in covariance structure selection. Another approach to testing hypotheses
on covariance parameters uses likelihood ratio tests. The statistics are constructed by taking the differences of the -2 Log
likelihoods of two nested models. Under the null hypothesis that the covariance parameters are 0 in the population, this
difference follows a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters
of the models.

To illustrate the use of the likelihood ratio test, we again look at the model in examples 9 and 10. In Example 9, we use a
scaled identity as the covariance matrix of the random effect “site.” In Example 10, however, we use a diagonal matrix with
unequal diagonal elements. Our goal is to discover which model better fits the data. We obtain the -2 Log likelihood values
and other criteria about the two models from the information criteria tables shown below.




Linear  Mixed-Effects  Modeling  in  SPSS 


Information criteria for Example 9


Information Criteria (a)

-2 Restricted Log Likelihood            523.532
Akaike's Information Criterion (AIC)    527.532
Hurvich and Tsai's Criterion (AICC)     527.563
Bozdogan's Criterion (CAIC)             537.469
Schwarz's Bayesian Criterion (BIC)      535.469

The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: current.

Figure  43 


Information criteria for Example 10


Information Criteria (a)

-2 Restricted Log Likelihood            519.290
Akaike's Information Criterion (AIC)    537.290
Hurvich and Tsai's Criterion (AICC)     537.763
Bozdogan's Criterion (CAIC)             582.009
Schwarz's Bayesian Criterion (BIC)      573.009

The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: current.

Figure  44 


The likelihood ratio test statistic for testing Example 9 (null hypothesis) versus Example 10 is 523.532 - 519.290 = 4.242.
This statistic has a chi-squared distribution with seven degrees of freedom, the difference in the number of covariance
parameters in the two models (nine versus two). The p-value of this statistic is 0.752, which is not significant at the 0.05 level.
The likelihood ratio test indicates, therefore, that we may use the simpler model in Example 9.
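
The likelihood ratio calculation can be reproduced outside SPSS. The sketch below is one way to do it in Python using only the standard library: because the standard library has no chi-squared distribution, the upper-tail probability is obtained from a series expansion of the regularized incomplete gamma function. The helper name is hypothetical, and a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with df
    degrees of freedom, via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


neg2ll_null = 523.532   # Example 9, COVTYPE(ID): 2 covariance parameters
neg2ll_alt = 519.290    # Example 10, COVTYPE(DIAG): 9 covariance parameters

lr_stat = neg2ll_null - neg2ll_alt
df = 9 - 2
p_value = chi2_sf(lr_stat, df)

print(f"LR = {lr_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

The parameter counts of two and nine include the residual variance in each model, so their difference is the seven extra site variances.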

Apart from Wald statistics and likelihood ratio tests, we can also use such information criteria as Akaike’s Information
Criterion (AIC) and Schwarz’s Bayesian Criterion (BIC) to search for the best model.
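
The four criteria in Figures 43 and 44 can be recovered from the -2 Restricted Log Likelihood. The Python sketch below uses the standard textbook definitions and assumes, since the text does not state it, that k counts only the covariance parameters and that the effective sample size under restricted maximum likelihood is n = 400 - 9 = 391 (400 observations minus nine non-redundant fixed-effects parameters). These assumptions were chosen because they reproduce the reported values.

```python
import math


def information_criteria(neg2_rll, n_cov_params, n_effective):
    """Smaller-is-better criteria computed from -2 restricted log likelihood."""
    k, n = n_cov_params, n_effective
    return {
        "AIC": neg2_rll + 2 * k,
        "AICC": neg2_rll + 2 * k * n / (n - k - 1),
        "CAIC": neg2_rll + k * (math.log(n) + 1),
        "BIC": neg2_rll + k * math.log(n),
    }


example_9 = information_criteria(523.532, n_cov_params=2, n_effective=391)
example_10 = information_criteria(519.290, n_cov_params=9, n_effective=391)

for name in ("AIC", "AICC", "CAIC", "BIC"):
    print(f"{name:5s} {example_9[name]:9.3f} {example_10[name]:9.3f}")
```

Every criterion is smaller for Example 9, so the information criteria point to the same simpler model as the likelihood ratio test.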

Random  coefficient  models 

In many situations, it is impossible to use a single regression line to describe the behavior of every individual. To account for
possible variations between individuals, we can treat the regression coefficients as random variables. This type of model is
therefore called the random coefficient model. We typically assume that the regression coefficients have normal distributions.
Here we have a dataset that was used by Willett (1988) and Singer (1998) as an illustration. The data are the performances of
35 individuals in an opposite-naming task on four consecutive occasions. The performance profiles of the 35 individuals are
shown in the following graph.


Figure 45

We can see that most individuals exhibit an increasing trend over time. Since a single regression line will not fit all of them, it
makes sense to use a random coefficient model. If we restrict ourselves to linear models, there are three possible model types:

■  Random  intercept 

■  Random  slopes 

■  Random  intercept  and  slopes 

Random  intercept  models 

As  the  name  suggests,  random  intercept  models  assume  that  each  individual  has  a  different  intercept.  In  this  model,  we 
assume  that  the  intercepts  have  an  iid  normal  distribution  with  a  mean  of  zero  and  some  unknown  variance. 

Example  13:  Random  intercept  models 

Command  syntax: 

MIXED Y WITH TIME
/FIXED INTERCEPT TIME
/RANDOM INTERCEPT | SUBJECT(ID) COVTYPE(ID)
/PRINT SOLUTION TESTCOV.


Output: 


Estimates of Fixed Effects (a)

                                                                   95% Confidence Interval
Parameter   Estimate   Std. Error   df        t        Sig.   Lower Bound   Upper Bound
Intercept   164.374    5.777        46.060    28.455   .000   152.747       176.002
time        26.960     1.466        104.000   18.395   .000   24.054        29.866

a. Dependent Variable: Performance.


Figure  46 


Estimates of Covariance Parameters (a)

                                                                                95% Confidence Interval
Parameter                             Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                              375.901    52.128       7.211    .000   286.440       493.303
Intercept [subject = id]   Variance   904.805    242.590      3.730    .000   534.979       1530.289

a. Dependent Variable: Performance.


Figure 47

The coefficients you see in the Estimates of Fixed Effects table define the estimated population regression line. Since we are
using a random intercept model, MIXED automatically estimates the variance of the random intercepts. The estimates are
found in the Estimates of Covariance Parameters table. The estimated variance of the intercept is 904.805, and its Wald test is
significant, which suggests that different individuals have different intercepts.
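
To see what a random intercept does to the predicted profiles, the following Python sketch shifts the estimated population line up or down by an individual deviation. The fixed-effects estimates are those in Figure 46; the time points 0 to 3 and the deviation of 30 (roughly one standard deviation, since the square root of 904.805 is about 30) are illustrative assumptions, not output from MIXED.

```python
FIXED_INTERCEPT = 164.374   # Estimates of Fixed Effects, Figure 46
FIXED_SLOPE = 26.960


def predict_random_intercept(deviation, times=(0, 1, 2, 3)):
    """Predicted profile for an individual whose intercept differs from the
    population intercept by `deviation`; the slope is common to everyone."""
    return [FIXED_INTERCEPT + deviation + FIXED_SLOPE * t for t in times]


population = predict_random_intercept(0.0)
above_average = predict_random_intercept(30.0)   # hypothetical individual

for t, (p, a) in enumerate(zip(population, above_average)):
    print(f"time {t}: population {p:8.3f}   individual {a:8.3f}")
```

Every individual's line is parallel to the population line, which is why this model alone cannot reproduce profiles that fan out or cross over time.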


[Plot for Figure 48: “Random intercept, constant slope”; predicted performance profiles, one line per individual (legend lists individuals 1 to 35).]

Figure  48 


Random  slopes  models 

Analogous  to  a  random  intercept  model,  a  random  slopes  model  assumes  that  each  individual  has  a  different  slope.  In  this 
model,  we  assume  that  the  slopes  have  an  iid  normal  distribution  with  a  mean  of  zero  and  an  unknown  variance. 


Example  14:  Random  slopes  models 


Command  syntax: 


MIXED  Y  WITH  TIME 

/FIXED  INTERCEPT  TIME 

/RANDOM  TIME  |  SUBJECT(ID)  COVTYPE(ID) 

/PRINT  SOLUTION  TESTCOV. 




Output: 


Estimates of Fixed Effects (a)

                                                                   95% Confidence Interval
Parameter   Estimate   Std. Error   df        t        Sig.   Lower Bound   Upper Bound
Intercept   164.374    3.793        104.000   43.336   .000   156.853       171.896
time        26.960     2.941        66.286    9.166    .000   21.088        32.832

a. Dependent Variable: Performance.


Figure 49

Estimates of Covariance Parameters (a)

                                                                           95% Confidence Interval
Parameter                        Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                         719.345    99.755       7.211    .000   548.147       944.013
time [subject = id]   Variance   158.946    51.507       3.086    .002   84.220        299.975

a. Dependent Variable: Performance.


Figure  50 


As  with  the  random  intercepts  model,  MIXED  provides  the  estimated  population  regression  line  in  the  Estimates  of  Fixed 
Effects  table,  and  the  variance  of  the  random  slopes  in  the  Estimates  of  Covariance  Parameters  table.  The  estimated  variance 
of  the  random  slopes  is  158.946,  which  is  highly  significant. 


[Plot for Figure 51: “Constant intercept, random slopes”; predicted performance against Time (0 to 3), one line per individual.]

Figure 51


In  comparing  the  predicted  profile  plots  in  examples  13  and  14  to  the  observed  profile  plots,  we  notice  that  neither  the 
random  intercepts  nor  the  random  slopes  model  can  completely  explain  the  variations  in  the  data.  We  therefore  need  to 
consider  a  more  complicated  model  that  has  both  random  intercepts  and  random  slopes. 

Random  intercepts  and  slopes  models 

When both intercepts and slopes are random, MIXED has more flexibility in modeling the data. In this model, pairs of
intercepts and slopes are assumed to have an iid bivariate normal distribution with a mean of zero and some unknown
covariance matrix.

Example 15: Random intercepts and slopes models


Command  syntax: 


MIXED  Y  WITH  TIME 
/FIXED  INTERCEPT  TIME 

/RANDOM  INTERCEPT  TIME  |  SUBJECT(ID)  COVTYPE(UN) 
/PRINT  SOLUTION  TESTCOV. 


Output: 


Estimates of Covariance Parameters (a)

                                                                           95% Confidence Interval
Parameter                        Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                         719.345    99.755       7.211    .000   548.147       944.013
time [subject = id]   Variance   158.946    51.507       3.086    .002   84.220        299.975

a. Dependent Variable: Performance.


Figure  52 


Estimates of Covariance Parameters (a)

                                                                           95% Confidence Interval
Parameter                        Estimate   Std. Error   Wald Z   Sig.   Lower Bound   Upper Bound
Residual                         159.477    26.957       5.916    .000   114.504       222.115
Intercept + time      UN (1,1)   1198.777   318.381      3.765    .000   712.310       2017.472
[subject = id]        UN (2,1)   -179.256   88.963       -2.015   .044   -353.621      -4.890
                      UN (2,2)   132.401    40.211       3.293    .001   73.009        240.106

a. Dependent Variable: Performance.


Figure  53 


In  addition  to  estimating  the  population  regression  line,  MIXED  also  estimates  the  variance  of  the  intercepts,  the  variance  of 
the  slopes,  and  the  covariance  between  the  intercepts  and  the  slopes.  All  of  the  variance  and  covariance  parameters  in  this 
model  are  significant  at  the  0.05  level.  We  can  see  that  the  predicted  profiles  of  the  35  individuals  as  shown  below  match 
the  observed  profile  much  better  than  the  profiles  produced  by  the  previous  two  models. 
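
The unstructured covariance matrix can be easier to read on the correlation scale. As a small illustration, the Python sketch below converts the UN(1,1), UN(2,1) and UN(2,2) estimates from Figure 53 into standard deviations and a correlation; the variable names are illustrative, and the conversion is the usual definition of a correlation rather than anything specific to MIXED.

```python
import math

var_intercept = 1198.777    # UN (1,1): variance of the random intercepts
cov_int_slope = -179.256    # UN (2,1): covariance of intercepts and slopes
var_slope = 132.401         # UN (2,2): variance of the random slopes

sd_intercept = math.sqrt(var_intercept)
sd_slope = math.sqrt(var_slope)
correlation = cov_int_slope / (sd_intercept * sd_slope)

print(f"SD(intercept) = {sd_intercept:.2f}")
print(f"SD(slope)     = {sd_slope:.2f}")
print(f"correlation   = {correlation:.3f}")
```

The negative correlation means that individuals who start with a higher performance tend to improve more slowly over the four occasions.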




Figure  54 


[Plot for Figure 54: “Random intercept, random slopes”; predicted performance profiles, one line per individual.]

If we compare the AIC of the three random coefficient models, we see that the random intercepts and slopes model has the
smallest AIC. It is therefore the best model of the three.


Model                          AIC
Random intercept               1304.340
Random slope                   1361.464
Random intercept and slope     1274.823

Figure  55 
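
The comparison in Figure 55 amounts to picking the smallest value. A minimal Python sketch, with an illustrative dictionary holding the three reported AIC values, also shows how far each model is from the best one:

```python
aic = {
    "Random intercept": 1304.340,
    "Random slope": 1361.464,
    "Random intercept and slope": 1274.823,
}

# Smaller is better, so the preferred model has the minimum AIC.
best_model = min(aic, key=aic.get)
aic_gap = {model: value - aic[best_model] for model, value in aic.items()}

print("Best model:", best_model)
for model, gap in aic_gap.items():
    print(f"{model:28s} AIC difference = {gap:7.3f}")
```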


Estimated  marginal  means 

Estimated marginal means (EMMEANS) are also known as modified population marginal means or predicted means. In most
cases, they are also the same as least squares means, which are group means that are estimated from the fitted model. In
general, they are preferred over observed means, which do not account for the underlying model of your data. In SPSS for
Windows, there are two ways to compute EMMEANS. The first method is to spell out the contrast matrix directly and use
the /TEST subcommand of MIXED to compute them. This is a laborious task, however, and prone to errors. The /EMMEANS
subcommand was therefore introduced to simplify the calculations. To illustrate, we apply the method to an example dataset
containing salary and demographic information for 474 individuals.

In the following example, we fit a fixed-effects model that predicts employee salary by using gender, minority group
membership, job classification, and education as predictors. Based on the model, we would like to find the predicted
salary for each job category.

Example 16: EMMEANS


Command  syntax: 

MIXED SALARY BY GENDER MINORITY JOBCAT WITH EDUC
/FIXED GENDER MINORITY JOBCAT EDUC
/PRINT LMATRIX SOLUTION
/EMMEANS = TABLES(JOBCAT) COMPARE ADJ(SIDAK).


Output: 


Type III Tests of Fixed Effects (a)

Source      Numerator df   Denominator df   F         Sig.
Intercept   1              468.000          50.475    .000
gender      1              468.000          33.730    .000
minority    1              468.000          4.053     .045
jobcat      2              468.000          189.799   .000
educ        1              468.000          56.511    .000

a. Dependent Variable: Current Salary.


Figure  56 


Estimates (b)

                                                                95% Confidence Interval
Employment Category   Mean            Std. Error   df        Lower Bound   Upper Bound
Clerical              28599.033 (a)   553.351      468.000   27511.673     29686.393
Custodial             32998.593 (a)   1961.688     468.000   29143.787     36853.399
Manager               55338.089 (a)   1309.287     468.000   52765.280     57910.899

a. Covariates appearing in the model are evaluated at the following values:
   Educational Level (years) = 13.49.
b. Dependent Variable: Current Salary.


Figure  57 


All the effects are significant at the 0.05 level, so it is logical to try to discover the mean salary of a particular
demographic group and compare it to that of other groups. The EMMEANS subcommand can help to answer these types
of questions. If you specify the option TABLES(JOBCAT) on an EMMEANS subcommand, it computes the predicted mean
of each job category using the fitted model. In general, these predicted means are different from the observed cell means.
The output is shown in the Estimates table (Figure 57). It shows that managers have the highest average salary ($55,338)
and clerks have the lowest average salary ($28,599).

In order to discover whether salaries in different job categories are significantly different from each other, you can use the
COMPARE option to instruct MIXED to perform all pairwise comparisons among all job categories. If you only want to compare
categories to a reference category, you can use the optional keyword REFCAT to specify the reference category. The ADJ(SIDAK)
option instructs MIXED to use the Sidak multiple-tests adjustment when calculating p-values. The results are shown in
the Pairwise Comparisons table (Figure 58). The p-values suggest that all pairs are significant at the 0.05 level, except the
comparison between the clerical group and the custodial group. The COMPARE option also performs a univariate test to
discover whether the means of all job categories are equal. In this example, the univariate test’s p-value is less than 0.05,
so we reject the null hypothesis of equal category means.
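
As a rough illustration of what COMPARE with ADJ(SIDAK) does, the Python sketch below forms the pairwise differences of the estimated marginal means and applies the Sidak formula to a p-value. The means are those of Figure 57, with digits lost in reproduction restored from the salaries quoted in the text and the confidence bounds. The unadjusted p-value of 0.02 is purely hypothetical, and the function name is illustrative.

```python
from itertools import combinations

# Estimated marginal means by job category (Figure 57).
emmeans = {"Clerical": 28599.033, "Custodial": 32998.593, "Manager": 55338.089}

# Mean difference (I - J) for every pair of categories.
differences = {
    (i, j): emmeans[i] - emmeans[j] for i, j in combinations(emmeans, 2)
}


def sidak_adjust(p_value, n_comparisons):
    """Sidak adjustment: probability of at least one p-value this small
    among n independent comparisons when all null hypotheses are true."""
    return 1.0 - (1.0 - p_value) ** n_comparisons


for pair, diff in differences.items():
    print(f"{pair[0]:9s} - {pair[1]:9s} = {diff:11.3f}")

# Hypothetical unadjusted p-value, adjusted for the three comparisons.
print(f"adjusted p = {sidak_adjust(0.02, len(differences)):.4f}")
```

A comparison with an unadjusted p-value of 0.02 would no longer be significant at the 0.05 level after adjusting for three comparisons, which is the kind of protection the adjustment provides.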

The  previous  example  is  relatively  simple.  Next,  we  will  illustrate  the  use  of  EMMEANS  in  a  more  sophisticated  model  that  is 
similar  to  Example  11.  The  model  we  are  going  to  use  is  essentially  the  same  as  the  one  used  in  Example  11,  but  with  the 
addition of the GENDER*SES interaction.

Pairwise Comparisons (b)

(I) Employment   (J) Employment   Mean Difference
Category         Category         (I-J)              df        Sig. (a)
Clerical         Custodial        -4399.561          468.000   .083
                 Manager          -26739.057 *       468.000   .000
Custodial        Clerical         4399.561           468.000   .083
                 Manager          -22339.496 *       468.000   .000
Manager          Clerical         26739.057 *        468.000   .000
                 Custodial        22339.496 *        468.000   .000

Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Sidak.
b. Dependent Variable: Current Salary.
[The Std. Error and 95% Confidence Interval for Difference columns are not legible in the source, apart from the
standard error of 1999.953 for the Clerical and Custodial pair.]


Figure  58 


Univariate Tests (a)

Numerator df   Denominator df   F         Sig.
2              468.000          189.799   .000

The F tests the effect of Employment Category. This test is based on the linearly independent pairwise
comparisons among the estimated marginal means.

a. Dependent Variable: Current Salary.


Figure  59 


Example 17: EMMEANS


Command  syntax: 


MIXED MATHTEST BY SCHOOL CLASS STUDENT GENDER SES SCHLYEAR
  /FIXED GENDER SES GENDER*SES SCHLYEAR SCHOOL
  /RANDOM SES | SUBJECT(SCHOOL*CLASS) COVTYPE(ID)
  /RANDOM SES | SUBJECT(SCHOOL*CLASS*STUDENT) COVTYPE(ID)
  /REPEATED SCHLYEAR | SUBJECT(SCHOOL*CLASS*STUDENT) COVTYPE(AR1)
  /PRINT SOLUTION TESTCOV
  /EMMEANS TABLES(SES*GENDER) COMPARE(SES) ADJ(SIDAK).

Output:


Estimates (a)

Socioeconomic                                          95% Confidence Interval
Status          Gender   Mean     Std. Error   df       Lower Bound   Upper Bound
I & II          Boy      31.287   2.271        44.353   26.710        35.864
                Girl     30.800   2.088        34.606   26.559        35.040
III to VI       Boy      25.211   1.283        19.171   22.528        27.895
                Girl     25.558   1.260        17.387   22.904        28.212
Other           Boy      23.500   1.676        31.892   20.085        26.915
                Girl     26.898   1.765        37.100   23.322        30.473

a. Dependent Variable: Math test.


Figure  60 


Pairwise Comparisons (b)

                                                Mean                                       95% Confidence Interval
         (I) Socioeconomic  (J) Socioeconomic   Difference                                 for Difference (a)
Gender   Status             Status              (I-J)        Std. Error   df       Sig. (a)   Lower Bound   Upper Bound
Boy      I & II             III to VI            6.076       2.614        37.185   .075         -.458        12.610
                            Other                7.788*      2.867        40.812   .029          .650        14.925
         III to VI          I & II              -6.076       2.614        37.185   .075       -12.610          .458
                            Other                1.712       2.099        26.131   .807        -3.642         7.065
         Other              I & II              -7.788*      2.867        40.812   .029       -14.925         -.650
                            III to VI           -1.712       2.099        26.131   .807        -7.065         3.642
Girl     I & II             III to VI            5.242       2.440        29.808   .115         -.930        11.414
                            Other                3.902       2.763        37.687   .426        -3.051        10.854
         III to VI          I & II              -5.242       2.440        29.808   .115       -11.414          .930
                            Other               -1.340       2.172        29.159   .904        -6.842         4.162
         Other              I & II              -3.902       2.763        37.687   .426       -10.854         3.051
                            III to VI            1.340       2.172        29.159   .904        -4.162         6.842

Based on estimated marginal means.
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Sidak.
b. Dependent Variable: Math test.


Figure  61 

Univariate Tests (a)

Gender   Numerator df   Denominator df   F       Sig.
Boy      2              31.715           3.848   .032
Girl     2              30.831           2.308   .116

Each F tests the simple effects of Socioeconomic Status within each level combination of the other effects shown.
These tests are based on the linearly independent pairwise comparisons among the estimated marginal means.

a. Dependent Variable: Math test.


Figure  62 


The command syntax in Example 17 requests the predicted means of all gender-socioeconomic status combinations.
Since this model involves random effects, the predicted means are computed by averaging the random effects over
subjects. The predicted means of the six gender-socioeconomic status combinations are shown in the Estimates
table (Figure 60). Among the simple effects comparisons in the Pairwise Comparisons table (Figure 61), only
Socioeconomic Status I & II and Socioeconomic Status Other among boys are significantly different from each
other, with p-value 0.029.
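
The link between the Estimates table and the Pairwise Comparisons table can be checked by hand: each mean difference (I-J) is the difference between two estimated marginal means within the same gender. The Python sketch below uses values transcribed from Figures 60 and 61. The dictionary layout and function name are illustrative choices, and the tolerance of 0.0015 is an assumption that allows for the three-decimal rounding in the printed output.

```python
from itertools import combinations

# Estimated marginal means transcribed from the Estimates table (Figure 60).
emmeans = {
    ("Boy", "I & II"): 31.287,
    ("Boy", "III to VI"): 25.211,
    ("Boy", "Other"): 23.500,
    ("Girl", "I & II"): 30.800,
    ("Girl", "III to VI"): 25.558,
    ("Girl", "Other"): 26.898,
}

# Mean differences (I-J) transcribed from the Pairwise Comparisons table (Figure 61).
reported = {
    ("Boy", "I & II", "III to VI"): 6.076,
    ("Boy", "I & II", "Other"): 7.788,
    ("Boy", "III to VI", "Other"): 1.712,
    ("Girl", "I & II", "III to VI"): 5.242,
    ("Girl", "I & II", "Other"): 3.902,
    ("Girl", "III to VI", "Other"): -1.340,
}


def simple_effect_differences(means):
    """Difference of means for every pair of SES levels within each gender."""
    diffs = {}
    for gender in sorted({g for g, _ in means}):
        levels = [ses for g, ses in means if g == gender]
        for level_i, level_j in combinations(levels, 2):
            diffs[(gender, level_i, level_j)] = (
                means[(gender, level_i)] - means[(gender, level_j)]
            )
    return diffs


computed = simple_effect_differences(emmeans)
max_gap = max(abs(computed[key] - reported[key]) for key in reported)

for key in reported:
    print(key, f"computed={computed[key]:.3f}", f"reported={reported[key]:.3f}")
print(f"largest discrepancy: {max_gap:.4f}")
```

Any remaining discrepancy is in the last printed digit only, which is what rounding the means before subtracting would produce.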


The COMPARE(SES) option in Example 17 indicates that we want to perform a univariate test to determine whether the means
of socioeconomic status are the same within each gender. The results (see Figure 62) indicate that the socioeconomic status
means differ significantly at the 0.05 level among boys (p = 0.032) but not among girls (p = 0.116). This agrees with the
Pairwise Comparisons table (Figure 61).
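
One informal way to see the agreement with Figure 61 is to look at which Sidak-adjusted confidence intervals exclude zero. The Python sketch below uses interval bounds transcribed from Figure 61, one row per unordered pair. The helper name is hypothetical. The check is a reading aid: it does not reproduce the univariate F test itself.

```python
# 95% confidence intervals for the difference, transcribed from Figure 61.
intervals = {
    ("Boy", "I & II", "III to VI"): (-0.458, 12.610),
    ("Boy", "I & II", "Other"): (0.650, 14.925),
    ("Boy", "III to VI", "Other"): (-3.642, 7.065),
    ("Girl", "I & II", "III to VI"): (-0.930, 11.414),
    ("Girl", "I & II", "Other"): (-3.051, 10.854),
    ("Girl", "III to VI", "Other"): (-6.842, 4.162),
}


def excludes_zero(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0.0 or upper < 0.0


# Pairs whose adjusted interval does not contain zero.
significant = [
    pair for pair, (lower, upper) in intervals.items()
    if excludes_zero(lower, upper)
]

for gender in ("Boy", "Girl"):
    pairs = [pair[1:] for pair in significant if pair[0] == gender]
    print(gender, "pairs with interval excluding zero:", pairs)
```

Only the comparison between I & II and Other among boys has an interval that excludes zero, matching the single starred difference in Figure 61. No interval among girls excludes zero.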